LSTM can Solve Hard Long Time Lag Problems
Authors
Abstract
Standard recurrent nets cannot deal with long minimal time lags between relevant signals. Several recent NIPS papers propose alternative methods. We first show: problems used to promote various previous algorithms can be solved more quickly by random weight guessing than by the proposed algorithms. We then use LSTM, our own recent algorithm, to solve a hard problem that can neither be quickly solved by random search nor by any other recurrent net algorithm we are aware of. Traditional recurrent nets fail in case of long minimal time lags between input signals and corresponding error signals [7, 3]. Many recent papers propose alternative methods, e.g., [16, 12, 1, 5, 9]. For instance, Bengio et al. investigate methods such as simulated annealing, multi-grid random search, time-weighted pseudo-Newton optimization, and discrete error propagation [3]. They also propose an EM approach [1]. Quite a few papers use variants of the "2-sequence problem" (and "latch problem") to show the proposed algorithm's superiority, e.g., [3, 1, 5, 9]. Some papers also use the "parity problem", e.g., [3, 1]. Some of Tomita's [18] grammars are also often used as benchmark problems for recurrent nets [2, 19, 14, 11]. Trivial versus non-trivial tasks. By our definition, a "trivial" task is one that can be solved quickly by random search (RS) in weight space. RS works as follows: REPEAT randomly initialize the weights and test the resulting net on a training set UNTIL solution found.
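The RS loop described above can be sketched in a few lines. The sampling interval and trial budget here are illustrative assumptions (the paper draws each weight uniformly from a fixed interval), and `solves_task` stands in for building a net from the weights and testing it on the training set:

```python
import random

def random_search(num_weights, solves_task, max_trials=10000, seed=0):
    """Random weight guessing (RS): REPEAT randomly initialize the
    weights and test the resulting net UNTIL a solution is found."""
    rng = random.Random(seed)
    for trial in range(1, max_trials + 1):
        # Each weight is drawn uniformly from a fixed interval
        # (here [-100, 100], an illustrative choice).
        weights = [rng.uniform(-100.0, 100.0) for _ in range(num_weights)]
        if solves_task(weights):
            return trial, weights  # number of guesses used, plus the solution
    return None, None  # budget exhausted without finding a solution
```

A "trivial" task in the paper's sense is one where this loop terminates after comparatively few guesses.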
Similar resources
Bridging Long Time
Numerous recent papers (including many NIPS papers) focus on standard recurrent nets' inability to deal with long time lags between relevant input signals and teacher signals. Rather sophisticated, alternative methods were proposed. We first show: problems used to promote certain algorithms in numerous previous papers can be solved more quickly by random weight guessing than by the proposed algor...
Long Short Term
"Recurrent backprop" for learning to store information over extended time periods takes too long. The main reason is insufficient, decaying error back flow. We describe a novel, efficient "Long Short Term Memory" (LSTM) that overcomes this and related problems. Unlike previous approaches, LSTM can learn to bridge arbitrary time lags by enforcing constant error flow. Using gradient descent, LSTM explic...
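The constant-error-flow idea can be illustrated with a minimal single-memory-cell sketch (the weight names in the `W` dict are hypothetical, and this simplified form omits details of the original architecture): because the cell state `c` is updated purely additively, the gradient path through `c` does not decay multiplicatively across time steps.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, W):
    """One step of a single LSTM memory cell (simplified sketch).
    The additive cell-state update is what carries error back over
    long time lags without decay (the "constant error carousel")."""
    i = sigmoid(W["wi_x"] * x + W["wi_h"] * h_prev + W["bi"])    # input gate
    o = sigmoid(W["wo_x"] * x + W["wo_h"] * h_prev + W["bo"])    # output gate
    g = math.tanh(W["wg_x"] * x + W["wg_h"] * h_prev + W["bg"])  # candidate input
    c = c_prev + i * g        # additive update: no multiplicative decay of c
    h = o * math.tanh(c)      # gated cell output
    return h, c
```

With all weights at zero, the candidate input is zero and the cell state is carried through each step unchanged, which is the behavior that lets errors bridge arbitrary lags.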
Parallelizing Assignment Problem with DNA Strands
Background: Many problems of combinatorial optimization, which are solvable only in exponential time, are known to be Non-Deterministic Polynomial hard (NP-hard). With the advent of parallel machines, new opportunities have emerged to develop effective solutions for NP-hard problems. However, solving these problems in polynomial time needs massive parallel machines and ...
Image Captioning with Sparse LSTM
Long Short-Term Memory (LSTM) is widely used to solve sequence modeling problems, for example, image captioning. We found that LSTM cells are heavily redundant. We adopt network pruning to reduce the redundancy of LSTM and introduce sparsity as a new regularization to reduce overfitting. We can achieve better performance than the dense baseline while reducing the total number of parameters in LSTM...
Prediction of Covid-19 Prevalence and Fatality Rates in Iran Using Long Short-Term Memory Neural Network
Introduction: The rapid spread of COVID-19 has become a critical threat to the world. So far, millions of people worldwide have been infected with the disease. The Covid-19 pandemic has had significant effects on various aspects of human life. Currently, prediction of the virus's spread is essential in order to be safe and make necessary arrangements. It can help control the rate of its outbrea...
Publication date: 1996